Bioinformatics pipeline summary


An overview of the pipeline's processing steps

Author: Adrien Taudière

Date: October 28, 2024

Summary of the bioinformatics pipeline

Code
library(knitr)
library(targets)
library(MiscMetabar)
here::i_am("analysis/01_bioinformatics.qmd")
Code
tar_glimpse(
  script = here::here("_targets.R"),
  targets_only = TRUE,
  callr_arguments = list(show = FALSE)
)

Load phyloseq object from targets store

Code
d_pq <- tar_read("d_vs", store = here::here("_targets/"))

The {targets} package is at the core of this project. If you are not familiar with {targets}, please read the introduction of its user manual first.

The {targets} package stores … targets in a folder, here _targets/, and can load (tar_load()) or read (tar_read()) objects from this folder.
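
The difference between the two functions can be sketched on a throwaway pipeline. This is only a minimal illustration, not the project's pipeline: the target name d_demo and its value are hypothetical, and tar_dir() runs everything in a temporary directory so that the real _targets/ store is left untouched.

```r
library(targets)

tar_dir({
  # Hypothetical one-target pipeline, written to a temporary _targets.R.
  tar_script({
    library(targets)
    list(tar_target(d_demo, c(10, 20, 30)))
  }, ask = FALSE)
  tar_make()

  # tar_read() returns the stored value; the caller chooses the variable name.
  value <- tar_read(d_demo)

  # tar_load() creates a variable named after the target in the calling environment.
  tar_load(d_demo)
})
```

After this runs, value and d_demo hold the same object. This report uses tar_read() so that the target d_vs can be assigned to the shorter name d_pq.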

Sample data

Code
DT::datatable(d_pq@sam_data)

Sequences, samples and clusters across the pipeline

Code
formattable_pq(
  d_pq,
  "Type",
  min_nb_seq_taxa = 1000,
  taxonomic_levels = c("Order", "Family", "Genus"),
  log10trans = TRUE
)
Cleaning suppress 0 taxa (  ) and 0 sample(s) (  ).
Number of non-matching ASV 0
Number of matching ASV 1147
Number of filtered-out ASV 1135
Number of kept ASV 12
Number of kept samples 2
Cleaning suppress 0 taxa and 0 samples.
Joining with `by = join_by(OTU)`
OTU      Order       Family                Genus            Mono  proportion_samp  nb_seq
Taxa_2   Glomerales  Claroideoglomeraceae  Claroideoglomus  4.36  1                4.36
Taxa_12  Glomerales  NA                    NA               3.85  1                3.85
Taxa_21  NA          NA                    NA               3.64  1                3.64
Taxa_3   Glomerales  Glomeraceae           Glomus           3.61  1                3.61
Taxa_5   Glomerales  Glomeraceae           Glomus           3.48  1                3.48
Taxa_17  Glomerales  Glomeraceae           Glomus           3.42  1                3.42
Taxa_15  NA          NA                    NA               3.39  1                3.39
Taxa_13  Glomerales  Claroideoglomeraceae  Claroideoglomus  3.38  1                3.38
Taxa_23  Glomerales  Claroideoglomeraceae  Claroideoglomus  3.32  1                3.32
Taxa_19  NA          NA                    NA               3.23  1                3.23
Taxa_41  Glomerales  Glomeraceae           Glomus           3.14  1                3.14
Taxa_42  Glomerales  NA                    NA               3.05  1                3.05
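
Because the call sets log10trans = TRUE, the Mono and nb_seq columns are presumably on a log10 scale rather than raw sequence counts. As a rough sketch, assuming the transform is log10(x + 1) (an assumption to check against the MiscMetabar documentation) and using three values copied by hand from the table, approximate counts can be recovered like this:

```r
# nb_seq values copied from the table; they are rounded to two decimals,
# so the back-transformed counts are only approximate.
log_nb_seq <- c(Taxa_2 = 4.36, Taxa_12 = 3.85, Taxa_42 = 3.05)

# Assumed inverse of log10(x + 1); use 10^log_nb_seq if the transform is a plain log10.
approx_counts <- round(10^log_nb_seq - 1)
approx_counts
```

Under this assumption the three back-transformed values all stay above 1000, which is consistent with the min_nb_seq_taxa = 1000 filter.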

Session Information

Session information is detailed below. More information about the machine, the system, and the Python and R packages is available in the file data_final/information_run.txt.

Code
sessionInfo()
R version 4.4.1 (2024-06-14)
Platform: x86_64-pc-linux-gnu
Running under: Debian GNU/Linux 12 (bookworm)

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.11.0 
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.11.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=fr_FR.UTF-8        LC_COLLATE=fr_FR.UTF-8    
 [5] LC_MONETARY=fr_FR.UTF-8    LC_MESSAGES=fr_FR.UTF-8   
 [7] LC_PAPER=fr_FR.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=fr_FR.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/Paris
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices datasets  utils     methods   base     

other attached packages:
[1] MiscMetabar_0.10.1 purrr_1.0.2        dplyr_1.1.4        dada2_1.32.0      
[5] Rcpp_1.0.13        ggplot2_3.5.1      phyloseq_1.48.0    targets_1.8.0     
[9] knitr_1.48        

loaded via a namespace (and not attached):
  [1] bitops_1.0-9                deldir_2.0-4               
  [3] gridExtra_2.3               permute_0.9-7              
  [5] rlang_1.1.4                 magrittr_2.0.3             
  [7] ade4_1.7-22                 matrixStats_1.4.1          
  [9] compiler_4.4.1              mgcv_1.9-1                 
 [11] png_0.1-8                   callr_3.7.6                
 [13] vctrs_0.6.5                 reshape2_1.4.4             
 [15] stringr_1.5.1               pwalign_1.0.0              
 [17] pkgconfig_2.0.3             crayon_1.5.3               
 [19] fastmap_1.2.0               backports_1.5.0            
 [21] XVector_0.44.0              utf8_1.2.4                 
 [23] Rsamtools_2.20.0            rmarkdown_2.28             
 [25] UCSC.utils_1.0.0            ps_1.8.0                   
 [27] xfun_0.48                   cachem_1.1.0               
 [29] zlibbioc_1.50.0             GenomeInfoDb_1.40.1        
 [31] jsonlite_1.8.9              biomformat_1.32.0          
 [33] highr_0.11                  rhdf5filters_1.16.0        
 [35] DelayedArray_0.30.1         Rhdf5lib_1.26.0            
 [37] BiocParallel_1.38.0         jpeg_0.1-10                
 [39] parallel_4.4.1              cluster_2.1.6              
 [41] R6_2.5.1                    bslib_0.8.0                
 [43] RColorBrewer_1.1-3          stringi_1.8.4              
 [45] jquerylib_0.1.4             GenomicRanges_1.56.2       
 [47] SummarizedExperiment_1.34.0 iterators_1.0.14           
 [49] IRanges_2.38.1              Matrix_1.7-0               
 [51] splines_4.4.1               igraph_2.1.1               
 [53] tidyselect_1.2.1            viridis_0.6.5              
 [55] rstudioapi_0.17.1           abind_1.4-8                
 [57] yaml_2.3.10                 vegan_2.6-8                
 [59] codetools_0.2-20            hwriter_1.3.2.1            
 [61] processx_3.8.4              lattice_0.22-6             
 [63] tibble_3.2.1                plyr_1.8.9                 
 [65] Biobase_2.64.0              withr_3.0.1                
 [67] ShortRead_1.62.0            evaluate_1.0.1             
 [69] survival_3.7-0              RcppParallel_5.1.9         
 [71] formattable_0.2.1           Biostrings_2.72.1          
 [73] pillar_1.9.0                BiocManager_1.30.25        
 [75] MatrixGenerics_1.16.0       DT_0.33                    
 [77] renv_1.0.11                 foreach_1.5.2              
 [79] stats4_4.4.1                generics_0.1.3             
 [81] rprojroot_2.0.4             S4Vectors_0.42.1           
 [83] munsell_0.5.1               scales_1.3.0               
 [85] base64url_1.4               glue_1.8.0                 
 [87] tools_4.4.1                 interp_1.1-6               
 [89] data.table_1.16.2           GenomicAlignments_1.40.0   
 [91] visNetwork_2.1.2            rhdf5_2.48.0               
 [93] grid_4.4.1                  tidyr_1.3.1                
 [95] ape_5.8                     crosstalk_1.2.1            
 [97] latticeExtra_0.6-30         colorspace_2.1-1           
 [99] nlme_3.1-165                GenomeInfoDbData_1.2.12    
[101] cli_3.6.3                   fansi_1.0.6                
[103] viridisLite_0.4.2           S4Arrays_1.4.1             
[105] gtable_0.3.5                sass_0.4.9                 
[107] digest_0.6.37               BiocGenerics_0.50.0        
[109] SparseArray_1.4.8           htmlwidgets_1.6.4          
[111] htmltools_0.5.8.1           multtest_2.60.0            
[113] lifecycle_1.0.4             here_1.0.1                 
[115] httr_1.4.7                  secretbase_1.0.3           
[117] MASS_7.3-61                

Citation

BibTeX citation:
@online{taudière2024,
  author = {Taudière, Adrien},
  title = {Bioinformatics Pipeline Summary},
  date = {2024-10-28},
  langid = {en}
}
For attribution, please cite this work as:
Taudière, Adrien. 2024. “Bioinformatics Pipeline Summary.” October 28, 2024.